An Application of Approximated Entropy Measures in Decision Tree Induction

Authors

  • Sinh Hoa Nguyen
  • Hung Son Nguyen
Abstract

The main task in decision tree construction algorithms is to find the "best partition" of a set of objects. We consider the problem of searching for the optimal binary partition of a continuous attribute domain for large data sets stored in relational databases (RDB). The straightforward approach to optimal partition selection with respect to an entropy measure (which evaluates the quality of a partition) needs O(N) simple queries, where N is the number of pre-assumed partitions. We present new approximated entropy measures that make it possible to construct a partition very close to optimal using only O(log N) simple queries.
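The straightforward O(N) approach mentioned in the abstract can be sketched as follows. This is a minimal illustration of the baseline only, not the authors' approximated measures, which the abstract does not spell out. The function names (`entropy`, `partition_entropy`, `count_query`, `best_cut_exhaustive`) are hypothetical, and `count_query` is an in-memory stand-in for a simple counting query against a relational database.

```python
import math


def entropy(counts):
    """Shannon entropy in bits of a vector of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def partition_entropy(left, right):
    """Size-weighted entropy of a binary partition (lower is better)."""
    n_left, n_right = sum(left), sum(right)
    n = n_left + n_right
    return (n_left / n) * entropy(left) + (n_right / n) * entropy(right)


def count_query(values, labels, classes, cut):
    """Hypothetical stand-in for one simple query: class counts where value <= cut."""
    counts = {c: 0 for c in classes}
    for v, y in zip(values, labels):
        if v <= cut:
            counts[y] += 1
    return [counts[c] for c in classes]


def best_cut_exhaustive(values, labels, cuts):
    """Evaluate every candidate cut, issuing one count query per cut (O(N) queries)."""
    classes = sorted(set(labels))
    total = [labels.count(c) for c in classes]
    best_cut, best_score, queries = None, None, 0
    for cut in cuts:
        left = count_query(values, labels, classes, cut)
        queries += 1
        # The right-hand counts follow from the totals, so no second query is needed.
        right = [t - l for t, l in zip(total, left)]
        score = partition_entropy(left, right)
        if best_score is None or score < best_score:
            best_cut, best_score = cut, score
    return best_cut, best_score, queries
```

With N candidate cuts the loop issues N queries, which is the cost the approximated measures are meant to reduce to O(log N).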


Similar resources

Comparing different stopping criteria for fuzzy decision tree induction through IDFID3

Fuzzy Decision Tree (FDT) classifiers combine decision trees with approximate reasoning offered by fuzzy representation to deal with language and measurement uncertainties. When an FDT induction algorithm uses stopping criteria to halt the tree's growth early, the threshold values of those criteria control the number of nodes. Finding a proper threshold value for a stopping crite...


Application of Different Methods of Decision Tree Algorithm for Mapping Rangeland Using Satellite Imagery (Case Study: Doviraj Catchment in Ilam Province)

The use of satellite imagery to study Earth's resources has attracted the attention of many researchers. Different phenomena have different spectral responses to electromagnetic radiation. One major application of satellite data is the classification of land cover. In recent years, a number of classification algorithms have been developed for the classification of remote sensing data. One of the most nota...


INFORMATION MEASURES BASED TOPSIS METHOD FOR MULTICRITERIA DECISION MAKING PROBLEM IN INTUITIONISTIC FUZZY ENVIRONMENT

In fuzzy set theory, information measures play a paramount role in several areas such as decision making and pattern recognition. In this paper, a similarity measure based on the cosine function and entropy measures based on the logarithmic function are proposed for IFSs. Comparisons of the proposed similarity and entropy measures with existing ones are listed. Numerical results limpidly betoken th...


SHAPLEY FUNCTION BASED INTERVAL-VALUED INTUITIONISTIC FUZZY VIKOR TECHNIQUE FOR CORRELATIVE MULTI-CRITERIA DECISION MAKING PROBLEMS

The interval-valued intuitionistic fuzzy set (IVIFS) was developed to cope with the uncertainty of imprecise human thinking. In the present communication, new entropy and similarity measures for IVIFSs based on the exponential function are presented and compared with existing measures. Numerical results reveal that the proposed information measures attain the higher association with the existing me...


Comparison of Two Families of Entropy-based Classification Measures with and without Feature Selection

Many decision tree (DT) induction algorithms, including the popular C4.5 family, are based on the Conditional Entropy (CE) measure family. An interesting question involves the relative performance of other entropy measure families such as Class-Attribute Mutual Information (CAMI). We therefore conducted a theoretical analysis of the CAMI family that enabled us to expose relationships with CE an...




Publication date: 2002